Search Results: "cate"

20 February 2024

Niels Thykier: Language Server (LSP) support for debian/control

About a month ago, Otto Kek l inen asked for editor extensions for debian related files on the debian-devel mailing list. In that thread, I concluded that what we were missing was a "Language Server" (LSP) for our packaging files. Last week, I started a prototype for such a LSP for the debian/control file as a starting point based on the pygls library. The initial prototype worked and I could do very basic diagnostics plus completion suggestion for field names.
Current features I got 4 basic features implemented, though I have only been able to test two of them in emacs.
  • Diagnostics or linting of basic issues.
  • Completion suggestions for all known field names that I could think of and values for some fields.
  • Folding ranges (untested). This feature enables the editor to "fold" multiple lines. It is often used with multi-line comments and that is the feature currently supported.
  • On save, trim trailing whitespace at the end of lines (untested). Might not be registered correctly on the server end.
Despite its very limited feature set, I feel editing debian/control in emacs is now a much more pleasant experience. Coming back to the features that Otto requested, the above covers a grand total of zero. Sorry, Otto. It is not you, it is me.
Completion suggestions For completion, all known fields are completed. Place the cursor at the start of the line or in a partially written out field name and trigger the completion in your editor. In my case, I can type R-R-R and trigger the completion and the editor will automatically replace it with Rules-Requires-Root as the only applicable match. Your milage may vary since I delegate most of the filtering to the editor, meaning the editor has the final say about whether your input matches anything. The only filtering done on the server side is that the server prunes out fields already used in the paragraph, so you are not presented with the option to repeat an already used field, which would be an error. Admittedly, not an error the language server detects at the moment, but other tools will. When completing field, if the field only has one non-default value such as Essential which can be either no (the default, but you should not use it) or yes, then the completion suggestion will complete the field along with its value. This is mostly only applicable for "yes/no" fields such as Essential and Protected. But it does also trigger for Package-Type at the moment. As for completing values, here the language server can complete the value for simple fields such as "yes/no" fields, Multi-Arch, Package-Type and Priority. I intend to add support for Section as well - maybe also Architecture.
Diagnostics On the diagnostic front, I have added multiple diagnostics:
  • An error marker for syntax errors.
  • An error marker for missing a mandatory field like Package or Architecture. This also includes Standards-Version, which is admittedly mandatory by policy rather than tooling falling part.
  • An error marker for adding Multi-Arch: same to an Architecture: all package.
  • Error marker for providing an unknown value to a field with a set of known values. As an example, writing foo in Multi-Arch would trigger this one.
  • Warning marker for using deprecated fields such as DM-Upload-Allowed, or when setting a field to its default value for fields like Essential. The latter rule only applies to selected fields and notably Multi-Arch: no does not trigger a warning.
  • Info level marker if a field like Priority duplicates the value of the Source paragraph.
Notable omission at this time:
  • No errors are raised if a field does not have a value.
  • No errors are raised if a field is duplicated inside a paragraph.
  • No errors are used if a field is used in the wrong paragraph.
  • No spellchecking of the Description field.
  • No understanding that Foo and X[CBS]-Foo are related. As an example, XC-Package-Type is completely ignored despite being the old name for Package-Type.
  • Quick fixes to solve these problems... :)
Trying it out If you want to try, it is sadly a bit more involved due to things not being uploaded or merged yet. Also, be advised that I will regularly rebase my git branches as I revise the code. The setup:
  • Build and install the deb of the main branch of pygls from https://salsa.debian.org/debian/pygls The package is in NEW and hopefully this step will soon just be a regular apt install.
  • Build and install the deb of the rts-locatable branch of my python-debian fork from https://salsa.debian.org/nthykier/python-debian There is a draft MR of it as well on the main repo.
  • Build and install the deb of the lsp-support branch of debputy from https://salsa.debian.org/debian/debputy
  • Configure your editor to run debputy lsp debian/control as the language server for debian/control. This is depends on your editor. I figured out how to do it for emacs (see below). I also found a guide for neovim at https://neovim.io/doc/user/lsp. Note that debputy can be run from any directory here. The debian/control is a reference to the file format and not a concrete file in this case.
Obviously, the setup should get easier over time. The first three bullet points should eventually get resolved by merges and upload meaning you end up with an apt install command instead of them. For the editor part, I would obviously love it if we can add snippets for editors to make the automatically pick up the language server when the relevant file is installed.
Using the debputy LSP in emacs The guide I found so far relies on eglot. The guide below assumes you have the elpa-dpkg-dev-el package installed for the debian-control-mode. Though it should be a trivially matter to replace debian-control-mode with a different mode if you use a different mode for your debian/control file. In your emacs init file (such as ~/.emacs or ~/.emacs.d/init.el), you add the follow blob.
(with-eval-after-load 'eglot
    (add-to-list 'eglot-server-programs
        '(debian-control-mode . ("debputy" "lsp" "debian/control"))))
Once you open the debian/control file in emacs, you can type M-x eglot to activate the language server. Not sure why that manual step is needed and if someone knows how to automate it such that eglot activates automatically on opening debian/control, please let me know. For testing completions, I often have to manually activate them (with C-M-i or M-x complete-symbol). Though, it is a bit unclear to me whether this is an emacs setting that I have not toggled or something I need to do on the language server side.
From here As next steps, I will probably look into fixing some of the "known missing" items under diagnostics. The quick fix would be a considerable improvement to assisting users. In the not so distant future, I will probably start to look at supporting other files such as debian/changelog or look into supporting configuration, so I can cover formatting features like wrap-and-sort. I am also very much open to how we can provide integrations for this feature into editors by default. I will probably create a separate binary package for specifically this feature that pulls all relevant dependencies that would be able to provide editor integrations as well.

Niels Thykier: Language Server (LSP) support for debian/control

Work done:
  • [X] No errors are raised if a field does not have a value.
  • [X] No errors are raised if a field is duplicated inside a paragraph.
  • [X] No errors are used if a field is used in the wrong paragraph.
  • [ ] No spellchecking of the Description field.
  • [X] No understanding that Foo and X[CBS]-Foo are related. As an example, XC-Package-Type is completely ignored despite being the old name for Package-Type.
  • [X] Fixed the on-save trim end of line whitespace. Bug in the server end.
  • [X] Hover text for field names

19 February 2024

Matthew Garrett: Debugging an odd inability to stream video

We have a cabin out in the forest, and when I say "out in the forest" I mean "in a national forest subject to regulation by the US Forest Service" which means there's an extremely thick book describing the things we're allowed to do and (somewhat longer) not allowed to do. It's also down in the bottom of a valley surrounded by tall trees (the whole "forest" bit). There used to be AT&T copper but all that infrastructure burned down in a big fire back in 2021 and AT&T no longer supply new copper links, and Starlink isn't viable because of the whole "bottom of a valley surrounded by tall trees" thing along with regulations that prohibit us from putting up a big pole with a dish on top. Thankfully there's LTE towers nearby, so I'm simply using cellular data. Unfortunately my provider rate limits connections to video streaming services in order to push them down to roughly SD resolution. The easy workaround is just to VPN back to somewhere else, which in my case is just a Wireguard link back to San Francisco.

This worked perfectly for most things, but some streaming services simply wouldn't work at all. Attempting to load the video would just spin forever. Running tcpdump at the local end of the VPN endpoint showed a connection being established, some packets being exchanged, and then nothing. The remote service appeared to just stop sending packets. Tcpdumping the remote end of the VPN showed the same thing. It wasn't until I looked at the traffic on the VPN endpoint's external interface that things began to become clear.

This probably needs some background. Most network infrastructure has a maximum allowable packet size, which is referred to as the Maximum Transmission Unit or MTU. For ethernet this defaults to 1500 bytes, and these days most links are able to handle packets of at least this size, so it's pretty typical to just assume that you'll be able to send a 1500 byte packet. But what's important to remember is that that doesn't mean you have 1500 bytes of packet payload - that 1500 bytes includes whatever protocol level headers are on there. For TCP/IP you're typically looking at spending around 40 bytes on the headers, leaving somewhere around 1460 bytes of usable payload. And if you're using a VPN, things get annoying. In this case the original packet becomes the payload of a new packet, which means it needs another set of TCP (or UDP) and IP headers, and probably also some VPN header. This still all needs to fit inside the MTU of the link the VPN packet is being sent over, so if the MTU of that is 1500, the effective MTU of the VPN interface has to be lower. For Wireguard, this works out to an effective MTU of 1420 bytes. That means simply sending a 1500 byte packet over a Wireguard (or any other VPN) link won't work - adding the additional headers gives you a total packet size of over 1500 bytes, and that won't fit into the underlying link's MTU of 1500.

And yet, things work. But how? Faced with a packet that's too big to fit into a link, there are two choices - break the packet up into multiple smaller packets ("fragmentation") or tell whoever's sending the packet to send smaller packets. Fragmentation seems like the obvious answer, so I'd encourage you to read Valerie Aurora's article on how fragmentation is more complicated than you think. tl;dr - if you can avoid fragmentation then you're going to have a better life. You can explicitly indicate that you don't want your packets to be fragmented by setting the Don't Fragment bit in your IP header, and then when your packet hits a link where your packet exceeds the link MTU it'll send back a packet telling the remote that it's too big, what the actual MTU is, and the remote will resend a smaller packet. This avoids all the hassle of handling fragments in exchange for the cost of a retransmit the first time the MTU is exceeded. It also typically works these days, which wasn't always the case - people had a nasty habit of dropping the ICMP packets telling the remote that the packet was too big, which broke everything.

What I saw when I tcpdumped on the remote VPN endpoint's external interface was that the connection was getting established, and then a 1500 byte packet would arrive (this is kind of the behaviour you'd expect for video - the connection handshaking involves a bunch of relatively small packets, and then once you start sending the video stream itself you start sending packets that are as large as possible in order to minimise overhead). This 1500 byte packet wouldn't fit down the Wireguard link, so the endpoint sent back an ICMP packet to the remote telling it to send smaller packets. The remote should then have sent a new, smaller packet - instead, about a second after sending the first 1500 byte packet, it sent that same 1500 byte packet. This is consistent with it ignoring the ICMP notification and just behaving as if the packet had been dropped.

All the services that were failing were failing in identical ways, and all were using Fastly as their CDN. I complained about this on social media and then somehow ended up in contact with the engineering team responsible for this sort of thing - I sent them a packet dump of the failure, they were able to reproduce it, and it got fixed. Hurray!

(Between me identifying the problem and it getting fixed I was able to work around it. The TCP header includes a Maximum Segment Size (MSS) field, which indicates the maximum size of the payload for this connection. iptables allows you to rewrite this, so on the VPN endpoint I simply rewrote the MSS to be small enough that the packets would fit inside the Wireguard MTU. This isn't a complete fix since it's done at the TCP level rather than the IP level - so any large UDP packets would still end up breaking)

I've no idea what the underlying issue was, and at the client end the failure was entirely opaque: the remote simply stopped sending me packets. The only reason I was able to debug this at all was because I controlled the other end of the VPN as well, and even then I wouldn't have been able to do anything about it other than being in the fortuitous situation of someone able to do something about it seeing my post. How many people go through their lives dealing with things just being broken and having no idea why, and how do we fix that?

(Edit: thanks to this comment, it sounds like the underlying issue was a kernel bug that Fastly developed a fix for - under certain configurations, the kernel fails to associate the MTU update with the egress interface and so it continues sending overly large packets)

comment count unavailable comments

13 February 2024

Arturo Borrero Gonz lez: Back to the Wikimedia Foundation!

Wikimedia Foundation logo In October 2023, I departed from the Wikimedia Foundation, the non-profit organization behind well-known projects like Wikipedia and others, to join Spryker. However, in January 2024 Spryker conducted a round of layoffs reportedly due to budget and business reasons. I was among those affected, being let go just three months after joining the company. Fortunately, the Wikimedia Cloud Services team, where I previously worked, was still seeking to backfill my position, so I reached out to them. They graciously welcomed me back as a Senior Site Reliability Engineer, in the same team and position as before. Although this three-month career detour wasn t the outcome I initially envisioned, I found it to be a valuable experience. During this time, I gained knowledge in a new tech stack, based on AWS, and discovered new engineering methodologies. Additionally, I had the opportunity to meet some wonderful individuals. I believe I have emerged stronger from this experience. Returning to the Wikimedia Foundation is truly motivating. It feels privileged to be part of this mature organization, its community, and movement, with its inspiring mission and values. In addition, I m hoping that this also means I can once again dedicate a bit more attention to my FLOSS activities, such as my duties within the Debian project. My email address is back online: aborrero@wikimedia.org. You can find me again in the IRC libera.chat server, in the usual wikimedia channels, nick arturo.

Matthew Palmer: Not all TLDs are Created Equal

In light of the recent cancellation of the queer.af domain registration by the Taliban, the fragile and difficult nature of country-code top-level domains (ccTLDs) has once again been comprehensively demonstrated. Since many people may not be aware of the risks, I thought I d give a solid explainer of the whole situation, and explain why you should, in general, not have anything to do with domains which are registered under ccTLDs.

Top-level What-Now? A top-level domain (TLD) is the last part of a domain name (the collection of words, separated by periods, after the https:// in your web browser s location bar). It s the com in example.com, or the af in queer.af. There are two kinds of TLDs: country-code TLDs (ccTLDs) and generic TLDs (gTLDs). Despite all being TLDs, they re very different beasts under the hood.

What s the Difference? Generic TLDs are what most organisations and individuals register their domains under: old-school technobabble like com , net , or org , historical oddities like gov , and the new-fangled world of words like tech , social , and bank . These gTLDs are all regulated under a set of rules created and administered by ICANN (the Internet Corporation for Assigned Names and Numbers ), which try to ensure that things aren t a complete wild-west, limiting things like price hikes (well, sometimes, anyway), and providing means for disputes over names1. Country-code TLDs, in contrast, are all two letters long2, and are given out to countries to do with as they please. While ICANN kinda-sorta has something to do with ccTLDs (in the sense that it makes them exist on the Internet), it has no authority to control how a ccTLD is managed. If a country decides to raise prices by 100x, or cancel all registrations that were made on the 12th of the month, there s nothing anyone can do about it. If that sounds bad, that s because it is. Also, it s not a theoretical problem the Taliban deciding to asssert its bigotry over the little corner of the Internet namespace it has taken control of is far from the first time that ccTLDs have caused grief.

Shifting Sands The queer.af cancellation is interesting because, at the time the domain was reportedly registered, 2018, Afghanistan had what one might describe as, at least, a different political climate. Since then, of course, things have changed, and the new bosses have decided to get a bit more active. Those running queer.af seem to have seen the writing on the wall, and were planning on moving to another, less fraught, domain, but hadn t completed that move when the Taliban came knocking.

The Curious Case of Brexit When the United Kingdom decided to leave the European Union, it fell foul of the EU s rules for the registration of domains under the eu ccTLD3. To register (and maintain) a domain name ending in .eu, you have to be a resident of the EU. When the UK ceased to be part of the EU, residents of the UK were no longer EU residents. Cue much unhappiness, wailing, and gnashing of teeth when this was pointed out to Britons. Some decided to give up their domains, and move to other parts of the Internet, while others managed to hold onto them by various legal sleight-of-hand (like having an EU company maintain the registration on their behalf). In any event, all very unpleasant for everyone involved.

Geopolitics on the Internet?!? After Russia invaded Ukraine in February 2022, the Ukranian Vice Prime Minister asked ICANN to suspend ccTLDs associated with Russia. While ICANN said that it wasn t going to do that, because it wouldn t do anything useful, some domain registrars (the companies you pay to register domain names) ceased to deal in Russian ccTLDs, and some websites restricted links to domains with Russian ccTLDs. Whether or not you agree with the sort of activism implied by these actions, the fact remains that even the actions of a government that aren t directly related to the Internet can have grave consequences for your domain name if it s registered under a ccTLD. I don t think any gTLD operator will be invading a neighbouring country any time soon.

Money, Money, Money, Must Be Funny When you register a domain name, you pay a registration fee to a registrar, who does administrative gubbins and causes you to be able to control the domain name in the DNS. However, you don t own that domain name4 you re only renting it. When the registration period comes to an end, you have to renew the domain name, or you ll cease to be able to control it. Given that a domain name is typically your brand or identity online, the chances are you d prefer to keep it over time, because moving to a new domain name is a massive pain, having to tell all your customers or users that now you re somewhere else, plus having to accept the risk of someone registering the domain name you used to have and capturing your traffic it s all a gigantic hassle. For gTLDs, ICANN has various rules around price increases and bait-and-switch pricing that tries to keep a lid on the worst excesses of registries. While there are any number of reasonable criticisms of the rules, and the Internet community has to stay on their toes to keep ICANN from totally succumbing to regulatory capture, at least in the gTLD space there s some degree of control over price gouging. On the other hand, ccTLDs have no effective controls over their pricing. For example, in 2008 the Seychelles increased the price of .sc domain names from US$25 to US$75. No reason, no warning, just pay up .

Who Is Even Getting That Money? A closely related concern about ccTLDs is that some of the cool ones are assigned to countries that are not great. The poster child for this is almost certainly Libya, which has the ccTLD ly . While Libya was being run by a terrorist-supporting extremist, companies thought it was a great idea to have domain names that ended in .ly. These domain registrations weren t (and aren t) cheap, and it s hard to imagine that at least some of that money wasn t going to benefit the Gaddafi regime. Similarly, the British Indian Ocean Territory, which has the io ccTLD, was created in a colonialist piece of chicanery that expelled thousands of native Chagossians from Diego Garcia. Money from the registration of .io domains doesn t go to the (former) residents of the Chagos islands, instead it gets paid to the UK government. Again, I m not trying to suggest that all gTLD operators are wonderful people, but it s not particularly likely that the direct beneficiaries of the operation of a gTLD stole an island chain and evicted the residents.

Are ccTLDs Ever Useful? The answer to that question is an unqualified maybe . I certainly don t think it s a good idea to register a domain under a ccTLD for vanity purposes: because it makes a word, is the same as a file extension you like, or because it looks cool. Those ccTLDs that clearly represent and are associated with a particular country are more likely to be OK, because there is less impetus for the registry to try a naked cash grab. Unfortunately, ccTLD registries have a disconcerting habit of changing their minds on whether they serve their geographic locality, such as when auDA decided to declare an open season in the .au namespace some years ago. Essentially, while a ccTLD may have geographic connotations now, there s not a lot of guarantee that they won t fall victim to scope creep in the future. Finally, it might be somewhat safer to register under a ccTLD if you live in the location involved. At least then you might have a better idea of whether your domain is likely to get pulled out from underneath you. Unfortunately, as the .eu example shows, living somewhere today is no guarantee you ll still be living there tomorrow, even if you don t move house. In short, I d suggest sticking to gTLDs. They re at least lower risk than ccTLDs.

+1, Helpful If you ve found this post informative, why not buy me a refreshing beverage? My typing fingers (both of them) thank you in advance for your generosity.

Footnotes
  1. don t make the mistake of thinking that I approve of ICANN or how it operates; it s an omnishambles of poor governance and incomprehensible decision-making.
  2. corresponding roughly, though not precisely (because everything has to be complicated, because humans are complicated), to the entries in the ISO standard for Codes for the representation of names of countries and their subdivisions , ISO 3166.
  3. yes, the EU is not a country; it s part of the roughly, though not precisely caveat mentioned previously.
  4. despite what domain registrars try very hard to imply, without falling foul of deceptive advertising regulations.

12 February 2024

Andrew Cater: Lessons from (and for) colleagues - and, implicitly, how NOT to get on

I have had excellent colleagues both at my day job and, especially, in Debian over the last thirty-odd years. Several have attempted to give me good advice - others have been exemplars. People retire: sadly, people die. What impression do you want to leave behind when you leave here?Belatedly, I've come to realise that obduracy, sheer bloody mindedness, force of will and obstinacy will only get you so far. The following began very much as a tongue in cheek private memo to myself a good few years ago. I showed it to a colleague who suggested at the time that I should share it to a wider audience.

SOME ADVICE YOU MAY BENEFIT FROM

Personal conduct
  • Never argue with someone you believe to be arguing idiotically - a dispassionate bystander may have difficulty telling who's who.
  • You can't make yourself seem reasonable by behaving unreasonably
  • It does not matter how correct your point of view is if you get people's backs up
  • They may all be #####, @@@@@, %%%% and ******* - saying so out loud doesn't help improve matters and may make you seem intemperate.

Working with others
  • Be the change you want to be and behave the way you want others to behave in order to achieve the desired outcome.
  • You can demolish someone's argument constructively and add weight to good points rather than tearing down their ideas and hard work and being ultra-critical and negative - no-one likes to be told "You know - you've got a REALLY ugly baby there"
  • It's easier to work with someone than to work against them and have to apologise repeatedly.
  • Even when you're outstanding and superlative, even you had to learn it all once. Be generous to help others learn: you shouldn't have to teach too many times if you teach correctly once and take time in doing so.

Getting the message across
  • Stop: think: write: review: (peer review if necessary): publish.
  • Clarity is all: just because you understand it doesn't mean anyone else will.
  • It does not matter how correct your point of view is if you put it across badly.
  • If you're giving advice: make sure it is:
  • Considered
  • Constructive
  • Correct as far as you can (and)
  • Refers to other people who may be able to help
  • Say thank you promptly if someone helps you and be prepared to give full credit where credit's due.

Work is like that
  • You may not know all the answers or even have the whole picture - consult, take advice - LISTEN TO THE ADVICE
  • Sometimes the right answer is not the immediately correct answer
  • Corollary: Sometimes the right answer for the business is not your suggested/preferred outcome
  • Corollary: Just because you can do it like that in the real world doesn't mean that you can do it that way inside the business. [This realisation is INTENSELY frustrating but you have to learn to deal with it]
  • DON'T ALWAYS DO IT YOURSELF - Attempt to fix the system, sometimes allow the corporate monster to fail - then do it yourself and fix it. It is always easier and tempting to work round the system and Just Flaming Do It but it doesn't solve problems in the longer term and may create more problems and ill-feeling than it solves.

    [Worked out for Andy Cater for himself after many years of fighting the system as a misguided missile - though he will freely admit that he doesn't always follow them as often as he should :) ]


11 February 2024

Marco d'Itri: Extending access to the systemd RuntimeDirectory with a POSIX ACL

inn2 uses ephemeral UNIX domain sockets in /run/news/ to communicate with the ctlinnd program. Since the directory is only writeable by the "news" user, other unprivileged users are not able to use the command. I solved this by extending the inn2.service systemd unit with a drop-in file which uses setfacl to give access to my user "md" to the RuntimeDirectory created by systemd. This is the content of /etc/systemd/system/inn2.service.d/md-ctlinnd.conf:
[Service]
# innd will change the permissions of /run/news/ when started: without
# creating it now with mode 0775 then that will change the ACL mask.
RuntimeDirectoryMode=0775
# allow user md to run ctlinnd(8), which creates sockets in /run/news/
ExecStartPost=/usr/bin/setfacl --modify user:md:rwx $RUNTIME_DIRECTORY
The non-obvious issue here is that the innd daemon on startup will change the directory permissions in a way which sets a more restrictive (non group-writeable) ACL mask, and this would make the newly created user ACL ineffective. The solution is to create the directory group-writeable from start. (Beware: this creates a trivial privileges escalation from md to news.)

10 February 2024

Andrew Cater: Debian point releases - updated media for Bullseye (11.9) and Bookworm (12.5) - 2024-02-10

It's been a LONG day: two point releases in a day takes of the order of twelve or thirteen hours of fairly solid work on behalf of those doing the releases and testing.Thanks firstly to the main Debian release team for all the initial work.Thanks to Isy, RattusRattus, Sledge and egw in Cottenham, smcv and Helen closer to the centre of Cambridge, cacin and others who have dropped in and out of IRC and helped testing.

I've been at home but active on IRC - missing the team (and the food) and drinking far too much coffee/eating too many biscuits.

We've found relatively few bugs that we haven't previously noted: it's been a good day. Back again, at some point a couple of months from now to do this all over again.With luck, I can embed a picture of the Cottenham folk below: it's fun to know _exactly_ where people are because you've been there yourself.



9 February 2024

Thorsten Alteholz: My Debian Activities in January 2024

FTP master This month I accepted 333 and rejected 31 packages. The overall number of packages that got accepted was 342.

Hooray, I already accepted package number 30000.

The statistic, where I get my numbers from, started in February 2002. Up to now 81694 packages got accepted. Given that I accepted package 20000 in October 2020, would I be able to accept half of the packages that made it through NEW? Debian LTS This was my hundred-fifteenth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded: This month I was finally able to really run the test suite of bind9. I already wanted to give up with this package, but Santiago encouraged me to proceed. So, here you are fixed-Buster-version. Jessie and Stretch have to wait a bit until the dust has settled. Last but not least I also did a few days of frontdesk duties. Debian ELTS This month was the sixty-sixth ELTS month. During my allocated time I uploaded: This month I also worked on the Jessie and Stretch updates for bind9. The uploads should happen soon. I also started to work on an update for exim4. Last but not least I did a few days of frontdesk duties. Debian Printing This month I adopted: At the moment these packages are the last adoptions to preserve the old printing protocol within Debian. If you know of other packages that should be retained, please don t hesitate to ask me. But don t wait for too long, I have fun to process RM-bugs :-). This work is generously funded by Freexian! Debian Astro This month I uploaded a new upstream version of: Debian IoT This month I uploaded new upstream versions of: Other stuff This month I uploaded new upstream version of packages, did a source upload for the transition or uploaded it to fix one or the other issue:

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform used by titans like Google, Meta, Boeing, and Lockheed Martin :
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch s use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project s README file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval d or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool s homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I m not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) . Additionally 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Community updates There were made a number of improvements to our website, including Bernhard M. Wiedemann fixing a number of typos of the term nondeterministic . [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 254 and 255 to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well considerable work from Vekhir in order to fix compatibility between various and subtle incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don t mark a job as failed if process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it s January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance was performed by Holger Levsen (eg. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (eg. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detects, dissects and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

2 February 2024

Ian Jackson: UPS, the Useless Parcel Service; VAT and fees

I recently had the most astonishingly bad experience with UPS, the courier company. They severely damaged my parcels, and were very bad about UK import VAT, ultimately ending up harassing me on autopilot. The only thing that got their attention was my draft Particulars of Claim for intended legal action. Surprisingly, I got them to admit in writing that the disbursement fee they charge recipients alongside the actual VAT, is just something they made up with no legal basis. What happened Autumn last year I ordered some furniture from a company in Germany. This was to be shipped by them to me by courier. The supplier chose UPS. UPS misrouted one of the three parcels to Denmark. When everything arrived, it had been sat on by elephants. The supplier had to replace most of it, with considerable inconvenience and delay to me, and of course a loss to the supplier. But this post isn t mostly about that. This post is about VAT. You see, import VAT was due, because of fucking Brexit. UPS made a complete hash of collecting that VAT. Their computers can t issue coherent documents, their email helpdesk is completely useless, and their automated debt collection systems run along uninfluenced by any external input. The crazy, including legal threats and escalating late payment fees, continued even after I paid the VAT discrepancy (which I did despite them not yet having provided any coherent calculation for it). This kind of behaviour is a very small and mild version of the kind of things British Gas did to Lisa Ferguson, who eventually won substantial damages for harassment, plus 10K of costs. Having tried asking nicely, and sending stiff letters, I too threatened litigation. I would have actually started a court claim, but it would have included a claim under the Protection from Harassment Act. Those have to be filed under the Part 8 procedure , which involves sending all of the written evidence you re going to use along with the claim form. Collating all that would be a good deal of work, especially since UPS and ControlAccount didn t engage with me at all, so I had no idea which things they might actually dispute. So I decided that before issuing proceedings, I d send them a copy of my draft Particulars of Claim, along with an offer to settle if they would pay me a modest sum and stop being evil robots at me. Rather than me typing the whole tale in again, you can read the full gory details in the PDF of my draft Particulars of Claim. (I ve redacted the reference numbers). Outcome The draft Particulars finally got their attention. UPS sent me an offer: they agreed to pay me 50, in full and final settlement. That was close enough to my offer that I accepted it. I mostly wanted them to stop, and they do seem to have done so. And I ve received the 50. VAT calculation They also finally included an actual explanation of the VAT calculation. It s absurd, but it s not UPS s absurd:
The clearance was entered initially with estimated import charges of 400.03, consisting of 387.83 VAT, and 12.20 disbursement fee. This original entry regrettably did not include the freight cost for calculating the VAT, and as such when submitted for final entry the VAT value was adjusted to include this and an amended invoice was issued for an additional 39.84. HMRC calculate the amount against which VAT is raised using the value of goods, insurance and freight, however they also may apply a VAT adjustment figure. The VAT Adjustment is based on many factors (Incidental costs in regards to a shipment), which includes charge for currency conversion if the invoice does not list values in Sterling, but the main is due to the inland freight from airport of destination to the final delivery point, as this charge varies, for example, from EMA to Edinburgh would be 150, from EMA to Derby would be 1, so each year UPS must supply HMRC with all values incurred for entry build up and they give an average which UPS have to use on the entry build up as the VAT Adjustment. The correct calculation for the import charges is therefore as follows: Goods value divided by exchange rate 2,489.53 EUR / 1.1683 = 2,130.89 GBP Duty: Goods value plus freight (%) 2,130.89 GBP + 5% = 2,237.43 GBP. That total times the duty rate. X 0 % = 0 GBP VAT: Goods value plus freight (100%) 2,130.89 GBP + 0 = 2,130.89 GBP That total plus duty and VAT adjustment 2,130.89 GBP + 0 GBP + 7.49 GBP = 2,348.08 GBP. That total times 20% VAT = 427.67 GBP As detailed above we must confirm that the final VAT charges applied to the shipment were correct, and that no refund of this is therefore due.
This looks very like HMRC-originated nonsense. If only they had put it on the original bills! It s completely ridiculous that it took four months and near-litigation to obtain it. Disbursement fee One more thing. UPS billed me a 12 disbursement fee . When you import something, there s often tax to pay. The courier company pays that to the government, and the consignee pays it to the courier. Usually the courier demands it before final delivery, since otherwise they end up having to chase it as a debt. It is common for parcel companies to add a random fee of their own. As I note in my Particulars, there isn t any legal basis for this. In my own offer of settlement I proposed that UPS should:
State under what principle of English law (such as, what enactment or principle of Common Law), you levy the disbursement fee (or refund it).
To my surprise they actually responded to this in their own settlement letter. (They didn t, for example, mention the harassment at all.) They said (emphasis mine):
A disbursement fee is a fee for amounts paid or processed on behalf of a client. It is an established category of charge used by legal firms, amongst other companies, for billing of various ancillary costs which may be incurred in completion of service. Disbursement fees are not covered by a specific law, nor are they legally prohibited. Regarding UPS disbursement fee this is an administrative charge levied for the use of UPS deferment account to prepay import charges for clearance through CDS. This charge would therefore be billed to the party that is responsible for the import charges, normally the consignee or receiver of the shipment in question. The disbursement fee as applied is legitimate, and as you have stated is a commonly used and recognised charge throughout the courier industry, and I can confirm that this was charged correctly in this instance.
On UPS s analysis, they can just make up whatever fee they like. That is clearly not right (and I don t even need to refer to consumer protection law, which would also make it obviously unlawful). And, that everyone does it doesn t make it lawful. There are so many things that are ubiquitous but unlawful, especially nowadays when much of the legal system - especially consumer protection regulators - has been underfunded to beyond the point of collapse. Next time this comes up I might have a go at getting the fee back. (Obviously I ll have to pay it first, to get my parcel.) ParcelForce and Royal Mail I think this analysis doesn t apply to ParcelForce and (probably) Royal Mail. I looked into this in 2009, and I found that Parcelforce had been given the ability to write their own private laws: Schemes made under section 89 of the Postal Services Act 2000. This is obviously ridiculous but I think it was the law in 2009. I doubt the intervening governments have fixed it. Furniture Oh, yes, the actual furniture. The replacements arrived intact and are great :-).

comment count unavailable comments

1 February 2024

Russ Allbery: Review: System Collapse

Review: System Collapse, by Martha Wells
Series: Murderbot Diaries #7
Publisher: Tordotcom
Copyright: 2023
ISBN: 1-250-82698-5
Format: Kindle
Pages: 245
System Collapse is the second Murderbot novel. Including the novellas, it's the 7th in the series. Unlike Fugitive Telemetry, the previous novella that was out of chronological order, this is the direct sequel to Network Effect. A very direct sequel; it picks up just a few days after the previous novel ended. Needless to say, you should not start here. I was warned by other people and therefore re-read Network Effect immediately before reading System Collapse. That was an excellent idea, since this novel opens with a large cast, no dramatis personae, not much in the way of a plot summary, and a lot of emotional continuity from the previous novel. I would grumble about this more, like I have in other reviews, but I thoroughly enjoyed re-reading Network Effect and appreciated the excuse.
ART-drone said, I wouldn t recommend it. I lack a sense of proportional response. I don t advise engaging with me on any level.
Saying much about the plot of this book without spoiling Network Effect and the rest of the series is challenging. Murderbot is suffering from the aftereffects of the events of the previous book more than it expected or would like to admit. It and its humans are in the middle of a complicated multi-way negotiation with some locals, who the corporates are trying to exploit. One of the difficulties in that negotiation is getting people to believe that the corporations are as evil as they actually are, a plot element that has a depressing amount in common with current politics. Meanwhile, Murderbot is trying to keep everyone alive. I loved Network Effect, but that was primarily for the social dynamics. The planet that was central to the novel was less interesting, so another (short) novel about the same planet was a bit of a disappointment. This does give Wells a chance to show in more detail what Murderbot's new allies have been up to, but there is a lot of speculative exploration and detailed descriptions of underground tunnels that I found less compelling than the relationship dynamics of the previous book. (Murderbot, on the other hand, would much prefer exploring creepy abandoned tunnels to talking about its feelings.) One of the things this series continues to do incredibly well, though, is take non-human intelligence seriously in a world where the humans mostly don't. It perfectly fills a gap between Star Wars, where neither the humans nor the story take non-human intelligences seriously (hence the creepy slavery vibes as soon as you start paying attention to droids), and the Culture, where both humans and the story do. The corporates (the bad guys in this series) treat non-human intelligences the way Star Wars treats droids. The good guys treat Murderbot mostly like a strange human, which is better but still wrong, and still don't notice the numerous other machine intelligences. But Wells, as the author, takes all of the non-human characters seriously, which means there are complex and fascinating relationships happening at a level of the story that the human characters are mostly unaware of. I love that Murderbot rarely bothers to explain; if the humans are too blinkered to notice, that's their problem. About halfway into the story, System Collapse hits its stride, not coincidentally at the point where Murderbot befriends some new computers. The rest of the book is great. This was not as good as Network Effect. There is a bit less competence porn at the start, and although that's for good in-story reasons I still missed it. Murderbot's redaction of things it doesn't want to talk about got a bit annoying before it finally resolved. And I was not sufficiently interested in this planet to want to spend two novels on it, at least without another major revelation that didn't come. But it's still a Murderbot novel, which means it has the best first-person narrative voice I've ever read, some great moments, and possibly the most compelling and varied presentation of computer intelligence in science fiction at the moment.
There was no feed ID, but AdaCol2 supplied the name Lucia and when I asked it for more info, the gender signifier bb (which didn t translate) and he/him pronouns. (I asked because the humans would bug me for the information; I was as indifferent to human gender as it was possible to be without being unconscious.)
This is not a series to read out of order, but if you have read this far, you will continue to be entertained. You don't need me to tell you this nearly everyone reviewing science fiction is saying it but this series is great and you should read it. Rating: 8 out of 10

30 January 2024

Antoine Beaupr : router archeology: the Soekris net5001

Roadkiller was a Soekris net5501 router I used as my main gateway between 2010 and 2016 (for r seau and t l phone). It was upgraded to FreeBSD 8.4-p12 (2014-06-06) and pkgng. It was retired in favor of octavia around 2016. Roughly 10 years later (2024-01-24), I found it in a drawer and, to my surprised, it booted. After wrangling with a RS-232 USB adapter, a null modem cable, and bit rates, I even logged in:
comBIOS ver. 1.33  20070103  Copyright (C) 2000-2007 Soekris Engineering.
net5501
0512 Mbyte Memory                        CPU Geode LX 500 Mhz 
Pri Mas  WDC WD800VE-00HDT0              LBA Xlt 1024-255-63  78 Gbyte
Slot   Vend Dev  ClassRev Cmd  Stat CL LT HT  Base1    Base2   Int 
-------------------------------------------------------------------
0:01:2 1022 2082 10100000 0006 0220 08 00 00 A0000000 00000000 10
0:06:0 1106 3053 02000096 0117 0210 08 40 00 0000E101 A0004000 11
0:07:0 1106 3053 02000096 0117 0210 08 40 00 0000E201 A0004100 05
0:08:0 1106 3053 02000096 0117 0210 08 40 00 0000E301 A0004200 09
0:09:0 1106 3053 02000096 0117 0210 08 40 00 0000E401 A0004300 12
0:20:0 1022 2090 06010003 0009 02A0 08 40 80 00006001 00006101 
0:20:2 1022 209A 01018001 0005 02A0 08 00 00 00000000 00000000 
0:21:0 1022 2094 0C031002 0006 0230 08 00 80 A0005000 00000000 15
0:21:1 1022 2095 0C032002 0006 0230 08 00 00 A0006000 00000000 15
 4 Seconds to automatic boot.   Press Ctrl-P for entering Monitor.
 
                                            
                                                  ______
                                                    ____  __ ___  ___ 
            Welcome to FreeBSD!                     __   '__/ _ \/ _ \
                                                    __       __/  __/
                                                                      
    1. Boot FreeBSD [default]                     _     _   \___ \___ 
    2. Boot FreeBSD with ACPI enabled             ____   _____ _____
    3. Boot FreeBSD in Safe Mode                    _ \ / ____   __ \
    4. Boot FreeBSD in single user mode             _)   (___         
    5. Boot FreeBSD with verbose logging            _ < \___ \        
    6. Escape to loader prompt                      _)  ____)    __   
    7. Reboot                                                         
                                                  ____/ _____/ _____/
                                            
                                            
                                            
    Select option, [Enter] for default      
    or [Space] to pause timer  5            
  
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.4-RELEASE-p12 #5: Fri Jun  6 02:43:23 EDT 2014
    root@roadkiller.anarc.at:/usr/obj/usr/src/sys/ROADKILL i386
gcc version 4.2.2 20070831 prerelease [FreeBSD]
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Geode(TM) Integrated Processor by AMD PCS (499.90-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x5a2  Family = 5  Model = a  Stepping = 2
  Features=0x88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CLFLUSH,MMX>
  AMD Features=0xc0400000<MMX+,3DNow!+,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 506445824 (482 MB)
kbd1 at kbdmux0
K6-family MTRR support enabled (2 registers)
ACPI Error: A valid RSDP was not found (20101013/tbxfroot-309)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
cryptosoft0: <software crypto> on motherboard
pcib0 pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
Geode LX: Soekris net5501 comBIOS ver. 1.33 20070103 Copyright (C) 2000-2007
pci0: <encrypt/decrypt, entertainment crypto> at device 1.2 (no driver attached)
vr0: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe100-0xe1ff mem 0xa0004000-0xa00040ff irq 11 at device 6.0 on pci0
vr0: Quirks: 0x2
vr0: Revision: 0x96
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:00:24:cc:93:44
vr0: [ITHREAD]
vr1: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe200-0xe2ff mem 0xa0004100-0xa00041ff irq 5 at device 7.0 on pci0
vr1: Quirks: 0x2
vr1: Revision: 0x96
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:00:24:cc:93:45
vr1: [ITHREAD]
vr2: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe300-0xe3ff mem 0xa0004200-0xa00042ff irq 9 at device 8.0 on pci0
vr2: Quirks: 0x2
vr2: Revision: 0x96
miibus2: <MII bus> on vr2
ukphy2: <Generic IEEE 802.3u media interface> PHY 1 on miibus2
ukphy2:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr2: Ethernet address: 00:00:24:cc:93:46
vr2: [ITHREAD]
vr3: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xa0004300-0xa00043ff irq 12 at device 9.0 on pci0
vr3: Quirks: 0x2
vr3: Revision: 0x96
miibus3: <MII bus> on vr3
ukphy3: <Generic IEEE 802.3u media interface> PHY 1 on miibus3
ukphy3:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr3: Ethernet address: 00:00:24:cc:93:47
vr3: [ITHREAD]
isab0: <PCI-ISA bridge> at device 20.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD CS5536 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 20.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xa0005000-0xa0005fff irq 15 at device 21.0 on pci0
ohci0: [ITHREAD]
usbus0 on ohci0
ehci0: <AMD CS5536 (Geode) USB 2.0 controller> mem 0xa0006000-0xa0006fff irq 15 at device 21.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1 on ehci0
cpu0 on motherboard
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc8000-0xd27ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
atrtc0: <AT Real Time Clock> at port 0x70 irq 8 on isa0
ppc0: parallel port not found.
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
uart0: [FILTER]
uart0: console (19200,n,8,1)
uart1: <16550 or compatible> at port 0x2f8-0x2ff irq 3 on isa0
uart1: [FILTER]
Timecounter "TSC" frequency 499903982 Hz quality 800
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 480Mbps High Speed USB v2.0
ad0: 76319MB <WDC WD800VE-00HDT0 09.07D09> at ata0-master UDMA100 
ugen0.1: <AMD> at usbus0
uhub0: <AMD OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <AMD> at usbus1
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
GEOM: ad0s1: geometry does not match label (255h,63s != 16h,63s).
uhub0: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 4 ports with 4 removable, self powered
Trying to mount root from ufs:/dev/ad0s1a
The last log rotation is from 2016:
[root@roadkiller /var/log]# stat /var/log/wtmp      
65 61783 -rw-r--r-- 1 root wheel 208219 1056 "Nov  1 05:00:01 2016" "Jan 18 22:29:16 2017" "Jan 18 22:29:16 2017" "Nov  1 05:00:01 2016" 16384 4 0 /var/log/wtmp
Interestingly, I switched between eicat and teksavvy on December 11th. Which year? Who knows!
Dec 11 16:38:40 roadkiller mpd: [eicatL0] LCP: authorization successful
Dec 11 16:41:15 roadkiller mpd: [teksavvyL0] LCP: authorization successful
Never realized those good old logs had a "oh dear forgot the year" issue (that's something like Y2K except just "Y", I guess). That was probably 2015, because the log dates from 2017, and the last entry is from November of the year after the above:
[root@roadkiller /var/log]# stat mpd.log 
65 47113 -rw-r--r-- 1 root wheel 193008 71939195 "Jan 18 22:39:18 2017" "Jan 18 22:39:59 2017" "Jan 18 22:39:59 2017" "Apr  2 10:41:37 2013" 16384 140640 0 mpd.log
It looks like the system was installed in 2010:
[root@roadkiller /var/log]# stat /
63 2 drwxr-xr-x 21 root wheel 2120 512 "Jan 18 22:34:43 2017" "Jan 18 22:28:12 2017" "Jan 18 22:28:12 2017" "Jul 18 22:25:00 2010" 16384 4 0 /
... so it lived for about 6 years, but still works after almost 14 years, which I find utterly amazing. Another amazing thing is that there's tuptime installed on that server! That is a software I thought I discovered later and then sponsored in Debian, but turns out I was already using it then!
[root@roadkiller /var]# tuptime 
System startups:        19   since   21:20:16 11/07/15
System shutdowns:       0 ok   -   18 bad
System uptime:          85.93 %   -   1 year, 11 days, 10 hours, 3 minutes and 36 seconds
System downtime:        14.07 %   -   61 days, 15 hours, 22 minutes and 45 seconds
System life:            1 year, 73 days, 1 hour, 26 minutes and 20 seconds
Largest uptime:         122 days, 9 hours, 17 minutes and 6 seconds   from   08:17:56 02/02/16
Shortest uptime:        5 minutes and 4 seconds   from   21:55:00 01/18/17
Average uptime:         19 days, 19 hours, 28 minutes and 37 seconds
Largest downtime:       57 days, 1 hour, 9 minutes and 59 seconds   from   20:45:01 11/22/16
Shortest downtime:      -1 years, 364 days, 23 hours, 58 minutes and 12 seconds   from   22:30:01 01/18/17
Average downtime:       3 days, 5 hours, 51 minutes and 43 seconds
Current uptime:         18 minutes and 23 seconds   since   22:28:13 01/18/17
Actual up/down times:
[root@roadkiller /var]# tuptime -t
No.        Startup Date                                         Uptime       Shutdown Date   End                                                  Downtime
1     21:20:16 11/07/15      1 day, 0 hours, 40 minutes and 12 seconds   22:00:28 11/08/15   BAD                                  2 minutes and 37 seconds
2     22:03:05 11/08/15      1 day, 9 hours, 41 minutes and 57 seconds   07:45:02 11/10/15   BAD                                  3 minutes and 24 seconds
3     07:48:26 11/10/15    20 days, 2 hours, 41 minutes and 34 seconds   10:30:00 11/30/15   BAD                        4 hours, 50 minutes and 21 seconds
4     15:20:21 11/30/15                      19 minutes and 40 seconds   15:40:01 11/30/15   BAD                                   6 minutes and 5 seconds
5     15:46:06 11/30/15                      53 minutes and 55 seconds   16:40:01 11/30/15   BAD                           1 hour, 1 minute and 38 seconds
6     17:41:39 11/30/15     6 days, 16 hours, 3 minutes and 22 seconds   09:45:01 12/07/15   BAD                4 days, 6 hours, 53 minutes and 11 seconds
7     16:38:12 12/11/15   50 days, 17 hours, 56 minutes and 49 seconds   10:35:01 01/31/16   BAD                                 10 minutes and 52 seconds
8     10:45:53 01/31/16     1 day, 21 hours, 28 minutes and 16 seconds   08:14:09 02/02/16   BAD                                  3 minutes and 48 seconds
9     08:17:56 02/02/16    122 days, 9 hours, 17 minutes and 6 seconds   18:35:02 06/03/16   BAD                                 10 minutes and 16 seconds
10    18:45:18 06/03/16   29 days, 17 hours, 14 minutes and 43 seconds   12:00:01 07/03/16   BAD                                 12 minutes and 34 seconds
11    12:12:35 07/03/16   31 days, 17 hours, 17 minutes and 26 seconds   05:30:01 08/04/16   BAD                                 14 minutes and 25 seconds
12    05:44:26 08/04/16     15 days, 1 hour, 55 minutes and 35 seconds   07:40:01 08/19/16   BAD                                  6 minutes and 51 seconds
13    07:46:52 08/19/16     7 days, 5 hours, 23 minutes and 10 seconds   13:10:02 08/26/16   BAD                                  3 minutes and 45 seconds
14    13:13:47 08/26/16   27 days, 21 hours, 36 minutes and 14 seconds   10:50:01 09/23/16   BAD                                  2 minutes and 14 seconds
15    10:52:15 09/23/16   60 days, 10 hours, 52 minutes and 46 seconds   20:45:01 11/22/16   BAD                 57 days, 1 hour, 9 minutes and 59 seconds
16    21:55:00 01/18/17                        5 minutes and 4 seconds   22:00:04 01/18/17   BAD                                 11 minutes and 15 seconds
17    22:11:19 01/18/17                       8 minutes and 42 seconds   22:20:01 01/18/17   BAD                                   1 minute and 20 seconds
18    22:21:21 01/18/17                       8 minutes and 40 seconds   22:30:01 01/18/17   BAD   -1 years, 364 days, 23 hours, 58 minutes and 12 seconds
19    22:28:13 01/18/17                      20 minutes and 17 seconds
The last few entries are actually the tests I'm running now, it seems this machine thinks we're now on 2017-01-18 at ~22:00, while we're actually 2024-01-24 at ~12:00 local:
Wed Jan 18 23:05:38 EST 2017
FreeBSD/i386 (roadkiller.anarc.at) (ttyu0)
login: root
Password:
Jan 18 23:07:10 roadkiller login: ROOT LOGIN (root) ON ttyu0
Last login: Wed Jan 18 22:29:16 on ttyu0
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 8.4-RELEASE-p12 (ROADKILL) #5: Fri Jun  6 02:43:23 EDT 2014
Reminders:
 * commit stuff in /etc
 * reload firewall (in screen!):
    pfctl -f /etc/pf.conf ; sleep 1
 * vim + syn on makes pf.conf more readable
 * monitoring the PPPoE uplink:
   tail -f /var/log/mpd.log
Current problems:
 * sometimes pf doesn't start properly on boot, if pppoe failed to come up, use
   this to resume:
     /etc/rc.d/pf start
   it will kill your shell, but fix NAT (2012-08-10)
 * babel fails to start on boot (2013-06-15):
     babeld -D -g 33123 tap0 vr3
 * DNS often fails, tried messing with unbound.conf (2014-10-05) and updating
   named.root (2016-01-28) and performance tweaks (ee63689)
 * asterisk and mpd4 are deprecated and should be uninstalled when we're sure
   their replacements (voipms + ata and mpd5) are working (2015-01-13)
 * if IPv6 fails, it's because netblocks are not being routed upstream. DHCPcd
   should do this, but doesn't start properly, use this to resume (2015-12-21):
     /usr/local/sbin/dhcpcd -6 --persistent --background --timeout 0 -C resolv.conf ng0
This machine is doomed to be replaced with the new omnia router, Indiegogo
campaign should ship in april 2016: http://igg.me/at/turris-omnia/x
(I really like the motd I left myself there. In theory, I guess this could just start connecting to the internet again if I still had the same PPPoE/ADSL link I had almost a decade ago; obviously, I do not.) Not sure how the system figured the 2017 time: the onboard clock itself believes we're in 1980, so clearly the CMOS battery has (understandably) failed:
> ?
comBIOS Monitor Commands
boot [drive][:partition] INT19 Boot
reboot                   cold boot
download                 download a file using XMODEM/CRC
flashupdate              update flash BIOS with downloaded file
time [HH:MM:SS]          show or set time
date [YYYY/MM/DD]        show or set date
d[b w d] [adr]           dump memory bytes/words/dwords
e[b w d] adr value [...] enter bytes/words/dwords
i[b w d] port            input from 8/16/32-bit port
o[b w d] port value      output to 8/16/32-bit port
run adr                  execute code at adr
cmosread [adr]           read CMOS RAM data
cmoswrite adr byte [...] write CMOS RAM data
cmoschecksum             update CMOS RAM Checksum
set parameter=value      set system parameter to value
show [parameter]         show one or all system parameters
?/help                   show this help
> show
ConSpeed = 19200
ConLock = Enabled
ConMute = Disabled
BIOSentry = Enabled
PCIROMS = Enabled
PXEBoot = Enabled
FLASH = Primary
BootDelay = 5
FastBoot = Disabled
BootPartition = Disabled
BootDrive = 80 81 F0 FF 
ShowPCI = Enabled
Reset = Hard
CpuSpeed = Default
> time
Current Date and Time is: 1980/01/01 00:56:47
Another bit of archeology: I had documented various outages with my ISP... back in 2003!
[root@roadkiller ~/bin]# cat ppp_stats/downtimes.txt
11/03/2003 18:24:49 218
12/03/2003 09:10:49 118
12/03/2003 10:05:57 680
12/03/2003 10:14:50 106
12/03/2003 10:16:53 6
12/03/2003 10:35:28 146
12/03/2003 10:57:26 393
12/03/2003 11:16:35 5
12/03/2003 11:16:54 11
13/03/2003 06:15:57 18928
13/03/2003 09:43:36 9730
13/03/2003 10:47:10 23
13/03/2003 10:58:35 5
16/03/2003 01:32:36 338
16/03/2003 02:00:33 120
16/03/2003 11:14:31 14007
19/03/2003 00:56:27 11179
19/03/2003 00:56:43 5
19/03/2003 00:56:53 0
19/03/2003 00:56:55 1
19/03/2003 00:57:09 1
19/03/2003 00:57:10 1
19/03/2003 00:57:24 1
19/03/2003 00:57:25 1
19/03/2003 00:57:39 1
19/03/2003 00:57:40 1
19/03/2003 00:57:44 3
19/03/2003 00:57:53 0
19/03/2003 00:57:55 0
19/03/2003 00:58:08 0
19/03/2003 00:58:10 0
19/03/2003 00:58:23 0
19/03/2003 00:58:25 0
19/03/2003 00:58:39 1
19/03/2003 00:58:42 2
19/03/2003 00:58:58 5
19/03/2003 00:59:35 2
19/03/2003 00:59:47 3
19/03/2003 01:00:34 3
19/03/2003 01:00:39 0
19/03/2003 01:00:54 0
19/03/2003 01:01:11 2
19/03/2003 01:01:25 1
19/03/2003 01:01:48 1
19/03/2003 01:02:03 1
19/03/2003 01:02:10 2
19/03/2003 01:02:20 3
19/03/2003 01:02:44 3
19/03/2003 01:03:45 3
19/03/2003 01:04:39 2
19/03/2003 01:05:40 2
19/03/2003 01:06:35 2
19/03/2003 01:07:36 2
19/03/2003 01:08:31 2
19/03/2003 01:08:38 2
19/03/2003 01:10:07 3
19/03/2003 01:11:05 2
19/03/2003 01:12:03 3
19/03/2003 01:13:01 3
19/03/2003 01:13:58 2
19/03/2003 01:14:59 5
19/03/2003 01:15:54 2
19/03/2003 01:16:55 2
19/03/2003 01:17:50 2
19/03/2003 01:18:51 3
19/03/2003 01:19:46 2
19/03/2003 01:20:46 2
19/03/2003 01:21:42 3
19/03/2003 01:22:42 3
19/03/2003 01:23:37 2
19/03/2003 01:24:38 3
19/03/2003 01:25:33 2
19/03/2003 01:26:33 2
19/03/2003 01:27:30 3
19/03/2003 01:28:55 2
19/03/2003 01:29:56 2
19/03/2003 01:30:50 2
19/03/2003 01:31:42 3
19/03/2003 01:32:36 3
19/03/2003 01:33:27 2
19/03/2003 01:34:21 2
19/03/2003 01:35:22 2
19/03/2003 01:36:17 3
19/03/2003 01:37:18 2
19/03/2003 01:38:13 3
19/03/2003 01:39:39 2
19/03/2003 01:40:39 2
19/03/2003 01:41:35 3
19/03/2003 01:42:35 3
19/03/2003 01:43:31 3
19/03/2003 01:44:31 3
19/03/2003 01:45:53 3
19/03/2003 01:46:48 3
19/03/2003 01:47:48 2
19/03/2003 01:48:44 3
19/03/2003 01:49:44 2
19/03/2003 01:50:40 3
19/03/2003 01:51:39 1
19/03/2003 11:04:33 19   
19/03/2003 18:39:36 2833 
19/03/2003 18:54:05 825  
19/03/2003 19:04:00 454  
19/03/2003 19:08:11 210  
19/03/2003 19:41:44 272  
19/03/2003 21:18:41 208  
24/03/2003 04:51:16 6
27/03/2003 04:51:20 5
30/03/2003 04:51:25 5
31/03/2003 08:30:31 255  
03/04/2003 08:30:36 5
06/04/2003 01:16:00 621  
06/04/2003 22:18:08 17   
06/04/2003 22:32:44 13   
09/04/2003 22:33:12 28   
12/04/2003 22:33:17 6
15/04/2003 22:33:22 5
17/04/2003 15:03:43 18   
20/04/2003 15:03:48 5
23/04/2003 15:04:04 16   
23/04/2003 21:08:30 339  
23/04/2003 21:18:08 13   
23/04/2003 23:34:20 253  
26/04/2003 23:34:45 25   
29/04/2003 23:34:49 5
02/05/2003 13:10:01 185  
05/05/2003 13:10:06 5
08/05/2003 13:10:11 5
09/05/2003 14:00:36 63928
09/05/2003 16:58:52 2
11/05/2003 23:08:48 2
14/05/2003 23:08:53 6
17/05/2003 23:08:58 5
20/05/2003 23:09:03 5
23/05/2003 23:09:08 5
26/05/2003 23:09:14 5
29/05/2003 23:00:10 3
29/05/2003 23:03:01 10   
01/06/2003 23:03:05 4
04/06/2003 23:03:10 5
07/06/2003 23:03:38 28   
10/06/2003 23:03:50 12   
13/06/2003 23:03:55 6
14/06/2003 07:42:20 3
14/06/2003 14:37:08 3
15/06/2003 20:08:34 3
18/06/2003 20:08:39 6
21/06/2003 20:08:45 6
22/06/2003 03:05:19 138  
22/06/2003 04:06:28 3
25/06/2003 04:06:58 31   
28/06/2003 04:07:02 4
01/07/2003 04:07:06 4
04/07/2003 04:07:11 5
07/07/2003 04:07:16 5
12/07/2003 04:55:20 6
12/07/2003 19:09:51 1158 
12/07/2003 22:14:49 8025 
15/07/2003 22:14:54 6
16/07/2003 05:43:06 18   
19/07/2003 05:43:12 6
22/07/2003 05:43:17 5
23/07/2003 18:18:55 183  
23/07/2003 18:19:55 9
23/07/2003 18:29:15 158  
23/07/2003 19:48:44 4604 
23/07/2003 20:16:27 3
23/07/2003 20:37:29 1079 
23/07/2003 20:43:12 342  
23/07/2003 22:25:51 6158
Fascinating. I suspect the (IDE!) hard drive might be failing as I saw two new files created in /var that I didn't remember seeing before:
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 3@T3
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 DY5
So I shutdown the machine, possibly for the last time:
Waiting (max 60 seconds) for system process  bufdaemon' to stop...done
Waiting (max 60 seconds) for system process  syncer' to stop...
Syncing disks, vnodes remaining...3 3 0 1 1 0 0 done
All buffers synced.
Uptime: 36m43s
usbus0: Controller shutdown
uhub0: at usbus0, port 1, addr 1 (disconnected)
usbus0: Controller shutdown complete
usbus1: Controller shutdown
uhub1: at usbus1, port 1, addr 1 (disconnected)
usbus1: Controller shutdown complete
The operating system has halted.
Please press any key to reboot.
I'll finally note this was the last FreeBSD server I personally operated. I also used FreeBSD to setup the core routers at Koumbit but those were replaced with Debian recently as well. Thanks Soekris, that was some sturdy hardware. Hopefully this new Protectli router will live up to that "decade plus" challenge. Not sure what the fate of this device will be: I'll bring it to the next Montreal Debian & Stuff to see if anyone's interested, contact me if you can't show up and want this thing.

Matthew Palmer: Why Certificate Lifecycle Automation Matters

If you ve perused the ActivityPub feed of certificates whose keys are known to be compromised, and clicked on the Show More button to see the name of the certificate issuer, you may have noticed that some issuers seem to come up again and again. This might make sense after all, if a CA is issuing a large volume of certificates, they ll be seen more often in a list of compromised certificates. In an attempt to see if there is anything that we can learn from this data, though, I did a bit of digging, and came up with some illuminating results.

The Procedure I started off by finding all the unexpired certificates logged in Certificate Transparency (CT) logs that have a key that is in the pwnedkeys database as having been publicly disclosed. From this list of certificates, I removed duplicates by matching up issuer/serial number tuples, and then reduced the set by counting the number of unique certificates by their issuer. This gave me a list of the issuers of these certificates, which looks a bit like this:
/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Domain Validation Secure Server CA
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Organization Validation Secure Server CA
/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository//CN=Starfield Secure Certificate Authority - G2
/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA
/C=BE/O=GlobalSign nv-sa/CN=GlobalSign GCC R3 DV TLS CA 2020
Rather than try to work with raw issuers (because, as Andrew Ayer says, The SSL Certificate Issuer Field is a Lie), I mapped these issuers to the organisations that manage them, and summed the counts for those grouped issuers together.

The Data
Lieutenant Commander Data from Star Trek: The Next Generation Insert obligatory "not THAT data" comment here
The end result of this work is the following table, sorted by the count of certificates which have been compromised by exposing their private key:
IssuerCompromised Count
Sectigo170
ISRG (Let's Encrypt)161
GoDaddy141
DigiCert81
GlobalSign46
Entrust3
SSL.com1
If you re familiar with the CA ecosystem, you ll probably recognise that the organisations with large numbers of compromised certificates are also those who issue a lot of certificates. So far, nothing particularly surprising, then. Let s look more closely at the relationships, though, to see if we can get more useful insights.

Volume Control Using the issuance volume report from crt.sh, we can compare issuance volumes to compromise counts, to come up with a compromise rate . I m using the Unexpired Precertificates colume from the issuance volume report, as I feel that s the number that best matches the certificate population I m examining to find compromised certificates. To maintain parity with the previous table, this one is still sorted by the count of certificates that have been compromised.
IssuerIssuance VolumeCompromised CountCompromise Rate
Sectigo88,323,0681701 in 519,547
ISRG (Let's Encrypt)315,476,4021611 in 1,959,480
GoDaddy56,121,4291411 in 398,024
DigiCert144,713,475811 in 1,786,586
GlobalSign1,438,485461 in 31,271
Entrust23,16631 in 7,722
SSL.com171,81611 in 171,816
If we now sort this table by compromise rate, we can see which organisations have the most (and least) leakiness going on from their customers:
IssuerIssuance VolumeCompromised CountCompromise Rate
Entrust23,16631 in 7,722
GlobalSign1,438,485461 in 31,271
SSL.com171,81611 in 171,816
GoDaddy56,121,4291411 in 398,024
Sectigo88,323,0681701 in 519,547
DigiCert144,713,475811 in 1,786,586
ISRG (Let's Encrypt)315,476,4021611 in 1,959,480
By grouping by order-of-magnitude in the compromise rate, we can identify three bands :
  • The Super Leakers: Customers of Entrust and GlobalSign seem to love to lose control of their private keys. For Entrust, at least, though, the small volumes involved make the numbers somewhat untrustworthy. The three compromised certificates could very well belong to just one customer, for instance. I m not aware of anything that GlobalSign does that would make them such an outlier, either, so I m inclined to think they just got unlucky with one or two customers, but as CAs don t include customer IDs in the certificates they issue, it s not possible to say whether that s the actual cause or not.
  • The Regular Leakers: Customers of SSL.com, GoDaddy, and Sectigo all have compromise rates in the 1-in-hundreds-of-thousands range. Again, the low volumes of SSL.com make the numbers somewhat unreliable, but the other two organisations in this group have large enough numbers that we can rely on that data fairly well, I think.
  • The Low Leakers: Customers of DigiCert and Let s Encrypt are at least three times less likely than customers of the regular leakers to lose control of their private keys. Good for them!
Now we have some useful insights we can think about.

Why Is It So?
Professor Julius Sumner Miller If you don't know who Professor Julius Sumner Miller is, I highly recommend finding out
All of the organisations on the list, with the exception of Let s Encrypt, are what one might term traditional CAs. To a first approximation, it s reasonable to assume that the vast majority of the customers of these traditional CAs probably manage their certificates the same way they have for the past two decades or more. That is, they generate a key and CSR, upload the CSR to the CA to get a certificate, then copy the cert and key somewhere. Since humans are handling the keys, there s a higher risk of the humans using either risky practices, or making a mistake, and exposing the private key to the world. Let s Encrypt, on the other hand, issues all of its certificates using the ACME (Automatic Certificate Management Environment) protocol, and all of the Let s Encrypt documentation encourages the use of software tools to generate keys, issue certificates, and install them for use. Given that Let s Encrypt has 161 compromised certificates currently in the wild, it s clear that the automation in use is far from perfect, but the significantly lower compromise rate suggests to me that lifecycle automation at least reduces the rate of key compromise, even though it doesn t eliminate it completely.

Explaining the Outlier The difference in presumed issuance practices would seem to explain the significant difference in compromise rates between Let s Encrypt and the other organisations, if it weren t for one outlier. This is a largely traditional CA, with the manual-handling issues that implies, but with a compromise rate close to that of Let s Encrypt. We are, of course, talking about DigiCert. The thing about DigiCert, that doesn t show up in the raw numbers from crt.sh, is that DigiCert manages the issuance of certificates for several of the biggest hosted TLS providers, such as CloudFlare and AWS. When these services obtain a certificate from DigiCert on their customer s behalf, the private key is kept locked away, and no human can (we hope) get access to the private key. This is supported by the fact that no certificates identifiably issued to either CloudFlare or AWS appear in the set of certificates with compromised keys. When we ask for all certificates issued by DigiCert , we get both the certificates issued to these big providers, which are very good at keeping their keys under control, as well as the certificates issued to everyone else, whose key handling practices may not be quite so stringent. It s possible, though not trivial, to account for certificates issued to these hosted TLS providers, because the certificates they use are issued from intermediates branded to those companies. With the crt.sh psql interface we can run this query to get the total number of unexpired precertificates issued to these managed services:
SELECT SUM(sub.NUM_ISSUED[2] - sub.NUM_EXPIRED[2])
  FROM (
    SELECT ca.name, max(coalesce(coalesce(nullif(trim(cc.SUBORDINATE_CA_OWNER), ''), nullif(trim(cc.CA_OWNER), '')), cc.INCLUDED_CERTIFICATE_OWNER)) as OWNER,
           ca.NUM_ISSUED, ca.NUM_EXPIRED
      FROM ccadb_certificate cc, ca_certificate cac, ca
     WHERE cc.CERTIFICATE_ID = cac.CERTIFICATE_ID
       AND cac.CA_ID = ca.ID
  GROUP BY ca.ID
  ) sub
 WHERE sub.name ILIKE '%Amazon%' OR sub.name ILIKE '%CloudFlare%' AND sub.owner = 'DigiCert';
The number I get from running that query is 104,316,112, which should be subtracted from DigiCert s total issuance figures to get a more accurate view of what DigiCert s regular customers do with their private keys. When I do this, the compromise rates table, sorted by the compromise rate, looks like this:
IssuerIssuance VolumeCompromised CountCompromise Rate
Entrust23,16631 in 7,722
GlobalSign1,438,485461 in 31,271
SSL.com171,81611 in 171,816
GoDaddy56,121,4291411 in 398,024
"Regular" DigiCert40,397,363811 in 498,732
Sectigo88,323,0681701 in 519,547
All DigiCert144,713,475811 in 1,786,586
ISRG (Let's Encrypt)315,476,4021611 in 1,959,480
In short, it appears that DigiCert s regular customers are just as likely as GoDaddy or Sectigo customers to expose their private keys.

What Does It All Mean? The takeaway from all this is fairly straightforward, and not overly surprising, I believe.

The less humans have to do with certificate issuance, the less likely they are to compromise that certificate by exposing the private key. While it may not be surprising, it is nice to have some empirical evidence to back up the common wisdom. Fully-managed TLS providers, such as CloudFlare, AWS Certificate Manager, and whatever Azure s thing is called, is the platonic ideal of this principle: never give humans any opportunity to expose a private key. I m not saying you should use one of these providers, but the security approach they have adopted appears to be the optimal one, and should be emulated universally. The ACME protocol is the next best, in that there are a variety of standardised tools widely available that allow humans to take themselves out of the loop, but it s still possible for humans to handle (and mistakenly expose) key material if they try hard enough. Legacy issuance methods, which either cannot be automated, or require custom, per-provider automation to be developed, appear to be at least four times less helpful to the goal of avoiding compromise of the private key associated with a certificate.

Humans Are, Of Course, The Problem
Bender, the robot from Futurama, asking if we'd like to kill all humans No thanks, Bender, I'm busy tonight
This observation that if you don t let humans near keys, they don t get leaked is further supported by considering the biggest issuers by volume who have not issued any certificates whose keys have been compromised: Google Trust Services (fourth largest issuer overall, with 57,084,529 unexpired precertificates), and Microsoft Corporation (sixth largest issuer overall, with 22,852,468 unexpired precertificates). It appears that somewhere between most and basically all of the certificates these organisations issue are to customers of their public clouds, and my understanding is that the keys for these certificates are managed in same manner as CloudFlare and AWS the keys are locked away where humans can t get to them. It should, of course, go without saying that if a human can never have access to a private key, it makes it rather difficult for a human to expose it. More broadly, if you are building something that handles sensitive or secret data, the more you can do to keep humans out of the loop, the better everything will be.

Your Support is Appreciated If you d like to see more analysis of how key compromise happens, and the lessons we can learn from examining billions of certificates, please show your support by buying me a refreshing beverage. Trawling CT logs is thirsty work.

Appendix: Methodology Limitations In the interests of clarity, I feel it s important to describe ways in which my research might be flawed. Here are the things I know of that may have impacted the accuracy, that I couldn t feasibly account for.
  • Time Periods: Because time never stops, there is likely to be some slight mismatches in the numbers obtained from the various data sources, because they weren t collected at exactly the same moment.
  • Issuer-to-Organisation Mapping: It s possible that the way I mapped issuers to organisations doesn t match exactly with how crt.sh does it, meaning that counts might be skewed. I tried to minimise that by using the same data sources (the CCADB AllCertificates report) that I believe that crt.sh uses for its mapping, but I cannot be certain of a perfect match.
  • Unwarranted Grouping: I ve drawn some conclusions about the practices of the various organisations based on their general approach to certificate issuance. If a particular subordinate CA that I ve grouped into the parent organisation is managed in some unusual way, that might cause my conclusions to be erroneous. I was able to fairly easily separate out CloudFlare, AWS, and Azure, but there are almost certainly others that I didn t spot, because hoo boy there are a lot of intermediate CAs out there.

29 January 2024

Michael Ablassmeier: qmpbackup 0.28

Over the last weekend i had some spare time to improve qmpbackup a little more, the new version: and some minor code reworks. Hope its useful for someone.

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to to know about. If we look at the "competition" here such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such a Fedora's spec file being 3+ domain specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. Which means as a newbie, you can dedicate less mental resources to tracking multiple files and how they interact and more effort understanding the "one file" at hand. I started by asking myself how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert, I dug myself into rabbit hole and ended up somewhere else than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend more mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix Now, I could have closed the topic here and said "Look, I did debputy files plus couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission... In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature that I would need for this feature, if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling. Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it would be possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert it does nothing if a given pkgfile is not present. This lead to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them. With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files and annotate those along with manpage for the relevant debhelper command. The exciting thing about letting debpputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the wide-spread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice, surely, this would be a good place to stop, right...?
Adding more metadata to the files The debhelper detected files only had a command name and manpage URI to that command. It would be nice if we could contextualize this a bit more. Like is this file installed into the package as is like debian/pam or is it a file list to be processed like debian/install. To make this distinction, I could add the most common debhelper file types to my static list and then merge the result together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so added a new plugin feature to provide this kind of detail and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files. So, this gave birth file categories and configuration features, which described each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file and so on. I mentioned configuration features above and those were added because, I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper use filearray, filedoublearray or none of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bit of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled either by debhelper as it parses the file for you. In this case, the globs are applied to every token. However, this is not what dh_install does. Here the last token on each line is supposed to be a directory and therefore not subject to globs. Therefore, dh_install does the globbing itself afterwards but only on part of the tokens. So that is about 2 bits of entropy more. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice by the original debhelper maintainer took when he introduced the feature in debhelper/8.9.12. Except, dh_install feature interacts with the design choice and does enable glob expansion in the tool output, because it does so manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these difference, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context However, the dh-executable-config tag among other are only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exist, but then have to read in the extended description that that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using. Then tweak some metadata to enable per compat level rules. With that tags like dh-executable-config only appears for packages using compat 9 or later. Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for packager provided files that debputy supports and I am already using debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/ name , debputy presented me with the correct install-path with the package name placing the name placeholder. Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the complat level infrastructure and now changing the compat level would change the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git) Also, while we are in this install-pattern business, a number of debhelper config files causes files to be installed into a fixed directory. Like debian/docs which causes file to be installed into /usr/share/docs/ package . Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: The code currently does not account for the main documentation package context) It is rather common pattern for people to do debian/foo.in files, because they want to custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo ". Then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so lets detect those as well as if they were the original file but add a tag saying that they are a generate template and which file we suspect it generates. Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would still appear unfortunately. So off I went to filter the list of known configuration files against which dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata. But it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful any tool that wants to understand more about the package. So to get out of the rabbit hole, I for now wrapped all of this into JSON and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can try the following demo: In another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
Limitations of the approach As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed those hints, meaning the file will at least be detected. For new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

17 January 2024

Colin Watson: Task management

Now that I m freelancing, I need to actually track my time, which is something I ve had the luxury of not having to do before. That meant something of a rethink of the way I ve been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I m working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I d done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let s not. In general, I had the following goals (which were a bit reminiscent of my address book): I didn t do an elaborate evaluation of multiple options, because I m not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I d never do anything else. I just wanted something that works well enough for me. Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don t use Emacs and don t plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I ve tried vim-based versions of Org mode in the past I ve found they haven t really fitted my brain very well. Taskwarrior and Timewarrior One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering). Just running task shows you a task list sorted by Taskwarrior s idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task. task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:
$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail
Once I got comfortable with it, this was already a big improvement. I haven t bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it d show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven t needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach. Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:
$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect. I also started using VIT to simplify the day-to-day flow a little, which means I m normally just using one or two keystrokes rather than typing longer commands. That isn t really necessary from my point of view, but it does save some time. Android integration I left Android integration for a bit later since it wasn t essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let s Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there s still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):
#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found   exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service
I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):
server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem
Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver. I used TaskWarrior Mobile as the client. I have to say I wouldn t want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don t care about time tracking, e.g. personal tasks. But that s really all I need, so it meets my minimum requirements. Next? Considering this is literally the first thing I tried, I have to say I m pretty happy with it. There are a bunch of optional extras I haven t tried yet, but in general it kind of has the vim nature for me: if I need something it s very likely to exist or easy enough to build, but the features I don t use don t get in my way. I wouldn t recommend any of this to somebody who didn t already spend most of their time in a terminal - but I do. I m glad people have gone to all the effort to build this so I didn t have to.

16 January 2024

Matthew Palmer: Pwned Certificates on the Fediverse

As well as the collection and distribution of compromised keys, the pwnedkeys project also matches those pwned keys against issued SSL certificates. I m excited to announce that, as of the beginning of 2024, all matched certificates are now being published on the Fediverse, thanks to the botsin.space Mastodon server. Want to know which sites are susceptible to interception and interference, in (near-)real time? Do you have a burning desire to know who is issuing certificates to people that post their private keys in public? Now you can.

How It Works The process for publishing pwned certs is, roughly, as follows:
  1. All the certificates in Certificate Transparency (CT) logs are hoovered up (using my scrape-ct-log tool, the fastest log scraper in the west!), and the fingerprint of the public key of each certificate is stored in an LMDB datafile.
  2. As new private keys are identified as having been compromised, the fingerprint of that key is checked against all the LMDB files, which map key fingerprints to certificates (actually to CT log entry IDs, from which the certificates themselves are retrieved).
  3. If one or more matches are found, then the certificates using the compromised key are forwarded to the tooter , which publishes them for the world to marvel at.
This makes it sound all very straightforward, and it is in theory. The trick comes in optimising the pipeline so that the five million or so new certificates every day can get indexed on the one slightly middle-aged server I ve got, without getting backlogged.

Why Don t You Just Have the Certificates Revoked? Funny story about that I used to notify CAs of certificates they d issued using compromised keys, which had the effect of requiring them to revoke the associated certificates. However, several CAs disliked having to revoke all those certificates, because it cost them staff time (and hence money) to do so. They went so far as to change their procedures from the standard way of accepting problem reports (emailing a generic attestation of compromise), and instead required CA-specific hoop-jumping to notify them of compromised keys. Since the effectiveness of revocation in the WebPKI is, shall we say, homeopathic at best, I decided I couldn t be bothered to play whack-a-mole with CAs that just wanted to be difficult, and I stopped sending compromised key notifications to CAs. Instead, now I m publishing the details of compromised certificates to everyone, so that users can protect themselves directly should they choose to.

Further Work The astute amongst you may have noticed, in the above How It Works description, a bit of a gap in my scanning coverage. CAs can (and do!) issue certificates for keys that are already compromised, including weak keys that have been known about for a decade or more (1, 2, 3). However, as currently implemented, the pwnedkeys certificate checker does not automatically find such certificates. My plan is to augment the CT scraping / cert processing pipeline to check all incoming certificates against the existing (2M+) set of pwned keys. Though, with over five million new certificates to check every day, it s not necessarily as simple as just hit the pwnedkeys API for every new cert . The poor old API server might not like that very much.

Support My Work If you d like to see this extra matching happen a bit quicker, I ve setup a ko-fi supporters page, where you can support my work on pwnedkeys and the other open source software and projects I work on by buying me a refreshing beverage. I would be very appreciative, and your support lets me know I should do more interesting things with the giant database of compromised keys I ve accumulated.

15 January 2024

Russ Allbery: Review: The Library of Broken Worlds

Review: The Library of Broken Worlds, by Alaya Dawn Johnson
Publisher: Scholastic Press
Copyright: June 2023
ISBN: 1-338-29064-9
Format: Kindle
Pages: 446
The Library of Broken Worlds is a young-adult far-future science fantasy. So far as I can tell, it's stand-alone, although more on that later in the review. Freida is the adopted daughter of Nadi, the Head Librarian, and her greatest wish is to become a librarian herself. When the book opens, she's a teenager in highly competitive training. Freida is low-wetware, without the advanced and expensive enhancements of many of the other students competing for rare and prized librarian positions, which she makes up for by being the most audacious. She doesn't need wetware to commune with the library material gods. If one ventures deep into their tunnels and consumes their crystals, direct physical communion is possible. The library tunnels are Freida's second home, in part because that's where she was born. She was created by the Library, and specifically by Iemaja, the youngest of the material gods. Precisely why is a mystery. To Nadi, Freida is her daughter. To Quinn, Nadi's main political rival within the library, Freida is a thing, a piece of the library, a secondary and possibly rogue AI. A disruptive annoyance. The Library of Broken Worlds is the sort of science fiction where figuring out what is going on is an integral part of the reading experience. It opens with a frame story of an unnamed girl (clearly Freida) waking the god Nameren and identifying herself as designed for deicide. She provokes Nameren's curiosity and offers an Arabian Nights bargain: if he wants to hear her story, he has to refrain from killing her for long enough for her to tell it. As one might expect, the main narrative doesn't catch up to the frame story until the very end of the book. The Library is indeed some type of library that librarians can search for knowledge that isn't available from more mundane sources, but Freida's personal experience of it is almost wholly religious and oracular. The library's material gods are identified as AIs, but good luck making sense of the story through a science fiction frame, even with a healthy allowance for sufficiently advanced technology being indistinguishable from magic. The symbolism and tone is entirely fantasy, and late in the book it becomes clear that whatever the material gods are, they're not simple technological AIs in the vein of, say, Banks's Ship Minds. Also, the Library is not solely a repository of knowledge. It is the keeper of an interstellar peace. The Library was founded after the Great War, to prevent a recurrence. It functions as a sort of legal system and grand tribunal in ways that are never fully explained. As you might expect, that peace is based more on stability than fairness. Five of the players in this far future of humanity are the Awilu, the most advanced society and the first to leave Earth (or Tierra as it's called here); the Mah m, who possess the material war god Nameren of the frame story; the Lunars and Martians, who dominate the Sol system; and the surviving Tierrans, residents of a polluted and struggling planet that is ruthlessly exploited by the Lunars. The problem facing Freida and her friends at the start of the book is a petition brought by a young Tierran against Lunar exploitation of his homeland. His name is Joshua, and Freida is more than half in love with him. Joshua's legal argument involves interpretation of the freedom node of the treaty that ended the Great War, a node that precedent says gives the Lunars the freedom to exploit Tierra, but which Joshua claims has a still-valid originalist meaning granting Tierrans freedom from exploitation. There is, in short, a lot going on in this book, and "never fully explained" is something of a theme. Freida is telling a story to Nameren and only explains things Nameren may not already know. The reader has to puzzle out the rest from the occasional hint. This is made more difficult by the tendency of the material gods to communicate only in visions or guided hallucinations, full of symbolism that the characters only partly explain to the reader. Nonetheless, this did mostly work, at least for me. I started this book very confused, but by about the midpoint it felt like the background was coming together. I'm still not sure I understand the aurochs, baobab, and cicada symbolism that's so central to the framing story, but it's the pleasant sort of stretchy confusion that gives my brain a good workout. I wish Johnson had explained a few more things plainly, particularly near the end of the book, but my remaining level of confusion was within my tolerances. Unfortunately, the ending did not work for me. The first time I read it, I had no idea what it meant. Lots of baffling, symbolic things happened and then the book just stopped. After re-reading the last 10%, I think all the pieces of an ending and a bit of an explanation are there, but it's absurdly abbreviated. This is another book where the author appears to have been finished with the story before I was. This keeps happening to me, so this probably says something more about me than it says about books, but I want books to have an ending. If the characters have fought and suffered through the plot, I want them to have some space to be happy and to see how their sacrifices play out, with more detail than just a few vague promises. If much of the book has been puzzling out the nature of the world, I would like some concrete confirmation of at least some of my guesswork. And if you're going to end the book on radical transformation, I want to see the results of that transformation. Johnson does an excellent job showing how brutal the peace of the powerful can be, and is willing to light more things on fire over the course of this book than most authors would, but then doesn't offer the reader much in the way of payoff. For once, I wish this stand-alone turned out to be a series. I think an additional book could be written in the aftermath of this ending, and I would definitely read that novel. Johnson has me caring deeply about these characters and fascinated by the world background, and I'd happily spend another 450 pages finding out what happens next. But, frustratingly, I think this ending was indeed intended to wrap up the story. I think this book may fall between a few stools. Science fiction readers who want mysterious future worlds to be explained by the end of the book are going to be frustrated by the amount of symbolism, allusion, and poetic description. Literary fantasy readers, who have a higher tolerance for that style, are going to wish for more focused and polished writing. A lot of the story is firmly YA: trying and failing to fit in, developing one's identity, coming into power, relationship drama, great betrayals and regrets, overcoming trauma and abuse, and unraveling lies that adults tell you. But this is definitely not a straight-forward YA plot or world background. It demands a lot from the reader, and while I am confident many teenage readers would rise to that challenge, it seems like an awkward fit for the YA marketing category. About 75% of the way in, I would have told you this book was great and you should read it. The ending was a let-down and I'm still grumpy about it. I still think it's worth your attention if you're in the mood for a sink-or-swim type of reading experience. Just be warned that when the ride ends, I felt unceremoniously dumped on the pavement. Content warnings: Rape, torture, genocide. Rating: 7 out of 10

11 January 2024

Reproducible Builds: Reproducible Builds in December 2023

Welcome to the December 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).

Reproducible Builds: Increasing the Integrity of Software Supply Chains awarded IEEE Software Best Paper award In February 2022, we announced in these reports that a paper written by Chris Lamb and Stefano Zacchiroli was now available in the March/April 2022 issue of IEEE Software. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF). This month, however, IEEE Software announced that this paper has won their Best Paper award for 2022.

Reproducibility to affect package migration policy in Debian In a post summarising the activities of the Debian Release Team at a recent in-person Debian event in Cambridge, UK, Paul Gevers announced a change to the way packages are migrated into the staging area for the next stable Debian release based on its reproducibility status:
The folks from the Reproducibility Project have come a long way since they started working on it 10 years ago, and we believe it s time for the next step in Debian. Several weeks ago, we enabled a migration policy in our migration software that checks for regression in reproducibility. At this moment, that is presented as just for info, but we intend to change that to delays in the not so distant future. We eventually want all packages to be reproducible. To stimulate maintainers to make their packages reproducible now, we ll soon start to apply a bounty [speedup] for reproducible builds, like we ve done with passing autopkgtests for years. We ll reduce the bounty for successful autopkgtests at that moment in time.

Speranza: Usable, privacy-friendly software signing Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman have developed a new system called Speranza, which is aimed at reassuring software consumers that the product they are getting has not been tampered with and is coming directly from a source they trust. A write-up on TechXplore.com goes into some more details:
What we have done, explains Sollins, is to develop, prove correct, and demonstrate the viability of an approach that allows the [software] maintainers to remain anonymous. Preserving anonymity is obviously important, given that almost everyone software developers included value their confidentiality. This new approach, Sollins adds, simultaneously allows [software] users to have confidence that the maintainers are, in fact, legitimate maintainers and, furthermore, that the code being downloaded is, in fact, the correct code of that maintainer. [ ]
The corresponding paper is published on the arXiv preprint server in various formats, and the announcement has also been covered in MIT News.

Nondeterministic Git bundles Paul Baecher published an interesting blog post on Reproducible git bundles. For those who are not familiar with them, Git bundles are used for the offline transfer of Git objects without an active server sitting on the other side of a network connection. Anyway, Paul wrote about writing a backup system for his entire system, but:
I noticed that a small but fixed subset of [Git] repositories are getting backed up despite having no changes made. That is odd because I would think that repeated bundling of the same repository state should create the exact same bundle. However [it] turns out that for some, repositories bundling is nondeterministic.
Paul goes on to to describe his solution, which involves forcing git to be single threaded makes the output deterministic . The article was also discussed on Hacker News.

Output from libxlst now deterministic libxslt is the XSLT C library developed for the GNOME project, where XSLT itself is an XML language to define transformations for XML files. This month, it was revealed that the result of the generate-id() XSLT function is now deterministic across multiple transformations, fixing many issues with reproducible builds. As the Git commit by Nick Wellnhofer describes:
Rework the generate-id() function to return deterministic values. We use
a simple incrementing counter and store ids in the 'psvi' member of
nodes which was freed up by previous commits. The presence of an id is
indicated by a new "source node" flag.
This fixes long-standing problems with reproducible builds, see
https://bugzilla.gnome.org/show_bug.cgi?id=751621
This also hardens security, as the old implementation leaked the
difference between a heap and a global pointer, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1356211
The old implementation could also generate the same id for dynamically
created nodes which happened to reuse the same memory. Ids for namespace
nodes were completely broken. They now use the id of the parent element
together with the hex-encoded namespace prefix.

Community updates There were made a number of improvements to our website, including Chris Lamb fixing the generate-draft script to not blow up if the input files have been corrupted today or even in the past [ ], Holger Levsen updated the Hamburg 2023 summit to add a link to farewell post [ ] & to add a picture of a Post-It note. [ ], and Pol Dellaiera updated the paragraph about tar and the --clamp-mtime flag [ ]. On our mailing list this month, Bernhard M. Wiedemann posted an interesting summary on some of the reasons why packages are still not reproducible in 2023. diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including processing objdump symbol comment filter inputs as Python byte (and not str) instances [ ] and Vagrant Cascadian extended diffoscope support for GNU Guix [ ] and updated the version in that distribution to version 253 [ ].

Challenges of Producing Software Bill Of Materials for Java Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, C sar Soto-Valero and Martin Wittlinger (!) of the KTH Royal Institute of Technology in Sweden, have published an article in which they:
deep-dive into 6 tools and the accuracy of the SBOMs they produce for complex open-source Java projects. Our novel insights reveal some hard challenges regarding the accurate production and usage of software bills of materials.
The paper is available on arXiv.

Debian Non-Maintainer campaign As mentioned in previous reports, the Reproducible Builds team within Debian has been organising a series of online and offline sprints in order to clear the huge backlog of reproducible builds patches submitted by performing so-called NMUs (Non-Maintainer Uploads). During December, Vagrant Cascadian performed a number of such uploads, including: In addition, Holger Levsen performed three no-source-change NMUs in order to address the last packages without .buildinfo files in Debian trixie, specifically lorene (0.0.0~cvs20161116+dfsg-1.1), maria (1.3.5-4.2) and ruby-rinku (1.7.3-2.1).

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In December, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Fix matching packages for the [R programming language](https://en.wikipedia.org/wiki/R_(programming_language). [ ][ ][ ]
    • Add a Certbot configuration for the Nginx web server. [ ]
    • Enable debugging for the create-meta-pkgs tool. [ ][ ]
  • Arch Linux-related changes
    • The asp has been deprecated by pkgctl; thanks to dvzrv for the pointer. [ ]
    • Disable the Arch Linux builders for now. [ ]
    • Stop referring to the /trunk branch / subdirectory. [ ]
    • Use --protocol https when cloning repositories using the pkgctl tool. [ ]
  • Misc changes:
    • Install the python3-setuptools and swig packages, which are now needed to build OpenWrt. [ ]
    • Install pkg-config needed to build Coreboot artifacts. [ ]
    • Detect failures due to an issue where the fakeroot tool is implicitly required but not automatically installed. [ ]
    • Detect failures due to rename of the vmlinuz file. [ ]
    • Improve the grammar of an error message. [ ]
    • Document that freebsd-jenkins.debian.net has been updated to FreeBSD 14.0. [ ]
In addition, node maintenance was performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Next.

Previous.